Precision Software Applications Silver Collection 1

Text File | 1990-07-31 | 24KB | 487 lines

CHAPTER 25 - WHAT DOES IT ALL MEAN?

What does it all mean? Not a whole lot, actually. We can now save memory space and we can save a lot of time. But with the incessant march of technology these things mean less and less. A few years ago, when 64k or 128k was a lot of memory and memory was expensive, having a 20k program instead of a 40k program was a significant advantage. Now it means almost nothing unless it is a memory resident program.

What about disk space? Just a while back we were operating with two 360k floppy disks and a hard disk was too expensive. Nowadays everyone has a 20meg hard disk.{1}

And speed? Those programs that were slow on an 8088 now seem o.k. on an 80386. Those programs that were unbearably slow on the 8088 died a quick death and are no longer around.

Compilers are better and they have more subroutines available. It is also easier to program with them than to go down to the assembler level. What this chapter is about is when NOT to use the standard compiler functions and subroutines.

First, you should understand that all compiler subroutines are general purpose subroutines. They need to be all things to all people. Imagine what a vehicle would be like if we gave the designer the following specification:

    We want to be able to drive to the store for groceries. It should be fuel efficient. In case we want to go into the mountains it should be an all terrain vehicle. We also want to be able to haul a roomful of furniture from coast to coast. Oh yes, and we want to be able to race it at Le Mans.

Being universal requires a lot of code and it slows things down. Whether this extra code and time is too much is a question you need to decide for yourself.

First, here are some examples of size. This is a C program that does almost nothing:

    #include <stdio.h>

    int main(void)
    {
        int x ;

        x = 27 ;                   /* line 1 */
        scanf ( "%d", &x ) ;       /* line 2 */
        printf ( "%d\n", x ) ;     /* line 3 */
        return 0 ;
    }

____________________

1. Which has led to one of my pet peeves.
All installation programs for compilers and word processors dump EVERYTHING on the hard disk. This gives us subdirectories that have 50 files in them, and we don't have the foggiest notion of what any of the files are for. If these installation programs would only prompt us by type of file to find out what we want to install and what we want to leave off the hard disk, we would all be better off.
______________________

The PC Assembler Tutor - Copyright (C) 1990 Chuck Nelson

I have made 3 programs from this. LINE1.C has line 1 only. LINE2.C has lines 1 and 2. LINE3.C has lines 1, 2 and 3. For the non-C people, scanf is an input function and printf is an output function. Guess how big each program is. Here's the directory listing.

    LINE1    EXE     3176   6-22-90   8:48a
    LINE2    EXE     7170   6-22-90   8:49a
    LINE3    EXE     9134   6-22-90   8:49a

It takes about 3000 bytes to start a C program (this is the startup module), about 4000 bytes more to enter something and about 2000 bytes extra to print something. That first 3000 bytes is unavoidable if you are writing in a high level language. If you are doing a lot of general purpose i/o, these extra amounts aren't too bad.

There are two cases where you might want to use your own i/o routines: first, if you have something simple, and second, if you have something special. If you don't need all the flexibility of the general purpose routines, you are better off doing your own i/o. Here are two files that write a text screen to a disk file.

    COPYSCRN EXE    10454   6-10-90   9:30a
    INTSCRN  COM      445   6-12-90   7:40p

They both do the same thing except that the .COM file is a little more sophisticated. Notice the difference in size. Speed really doesn't play a part here because what they do is so simple that it takes just a second in any case. The program was so simple that it only took an hour or two to write, so I didn't lose any time by writing it in assembler.
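As an illustration of the "something simple" case, here is a minimal sketch in C of a do-it-yourself output routine that prints an unsigned number without printf. The names format_unsigned and print_unsigned_line are made up for this example, and the sketch makes no claim about how many bytes it would save with any particular compiler; it only shows how little code the special-purpose job needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from the tutorial): write the decimal text
   of `value` into `buf`. Returns the number of digits written, or 0 if
   `buf` is too small to hold the digits plus the terminating '\0'. */
size_t format_unsigned(unsigned int value, char *buf, size_t buf_size)
{
    char tmp[24];               /* room for a 64-bit value's 20 digits */
    size_t n = 0;
    size_t i;

    assert(buf != NULL);

    /* Peel digits off the low end; they come out in reverse order. */
    do {
        tmp[n++] = (char)('0' + value % 10u);
        value /= 10u;
    } while (value != 0u);

    if (n + 1 > buf_size)
        return 0;               /* not enough room: write nothing */

    /* Copy the digits back in the right order and terminate. */
    for (i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}

/* Hypothetical helper: print `value` and a newline using only
   fputs and putchar instead of the general purpose printf. */
void print_unsigned_line(unsigned int value)
{
    char text[24];

    if (format_unsigned(value, text, sizeof text) > 0) {
        fputs(text, stdout);
        putchar('\n');
    }
}
```

Calling print_unsigned_line ( x ) would take the place of line 3 in the little program above. It handles exactly one case (an unsigned decimal number), which is the whole point: no format string to parse, no field widths, no floating point.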

The other case is when you have a specific idea of what the screen should look like. You want control of the whole screen all the time. This includes all word processors, databases, programming environments, etc. They all take charge of the screen because some DOS functions are too slow. If you remember from the ZOOM chapter, there is a radical difference between what you can do and what DOS can do. Even though these large programs are written in C, they all bypass the C i/o functions. That does not mean that they go down to the assembler level, however.


INTERRUPTS

You have done a few interrupts. They call the standard DOS or BIOS functions. Remember, they do this by going into low memory (the first 1k of memory) and getting the address of the subprogram that handles that particular interrupt. However, you do not need to call these interrupts from the assembler level. All modern compilers support interrupt calls in the language. If yours doesn't, you need a more recent compiler.

Before going on with this chapter you need to read your compiler documentation about interrupts. Turbo Pascal has INTR, QuickBASIC and QuickC have INT86. Read the documentation now.

Have you read it? No cheating is allowed, because you won't understand the rest of this if you haven't read it.

Though technically the register block is a structure in C and a record in Pascal, it is actually an array where each array element has a specific name. The interrupt routine reads all these values into the corresponding registers, calls the interrupt, then reads the register values back into the array. Some languages have one array for the input and another for the output. Int 21h is special, so QuickC has a special function called INTDOS. It is the same as using INT86 with interrupt 21h.

The order of registers in the array is arbitrary and language dependent. For Turbo Pascal it is (AX, BX, CX, DX, BP, SI, DI, DS, ES, FLAGS).
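
To make the "array where each element has a name" idea concrete, here is a small sketch in C. The names named_regs, reg_block, REG_AX and so on are invented for this illustration; they are not the declarations from any compiler's header file. The field order is the Turbo Pascal order quoted above, and uint16_t stands in for a 16 bit register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustration only: one 16 bit slot per register, in the Turbo
   Pascal order (AX, BX, CX, DX, BP, SI, DI, DS, ES, FLAGS). */
struct named_regs {
    uint16_t ax, bx, cx, dx, bp, si, di, ds, es, flags;
};

/* The same 20 bytes viewed two ways: by name or by array index. */
union reg_block {
    struct named_regs r;
    uint16_t word[10];
};

/* Array indexes that match the named fields above. */
enum {
    REG_AX, REG_BX, REG_CX, REG_DX, REG_BP,
    REG_SI, REG_DI, REG_DS, REG_ES, REG_FLAGS,
    REG_COUNT
};

/* Byte offset of a register slot from the start of the block,
   i.e. the number an assembler routine would add to the block's
   address to reach that register's value. */
size_t reg_byte_offset(int index)
{
    assert(index >= 0 && index < REG_COUNT);
    return (size_t)index * sizeof(uint16_t);
}
```

With this layout, storing a value in the field named dx and reading element REG_DX of the array touch the same two bytes, and reg_byte_offset ( REG_DS ) is 14. A routine written in assembler only ever sees the offsets; the names exist for the convenience of the high level language.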

You enter values in the registers specified by the interrupt, and then call the interrupt. The routine does the rest.

    INTR ( int_no: byte; var the_regs: Registers)

This will push the interrupt number, then the array address. On entry to the interrupt call and after initializing BP, we will have:

               int_no            bp + 6
               array_address     bp + 4
               old IP            bp + 2
        bp ->  old BP            bp

What follows is not the exact code, but is similar to what the Pascal routine does:

    ; - - - - - - - - - -
    intr    proc near

            push  bp
            mov   bp, sp

            push  ax              ; save all registers except SP, DS, SS, CS
            push  bx
            push  cx
            push  dx
            push  si
            push  di
            push  bp              ; this is OUR bp
            push  es

            ; insert the interrupt number in the interrupt
            mov   al, [bp+6]      ; AL now contains the interrupt number
            lea   si, interrupt_spot    ; where the interrupt is
            mov   cs:[si+1], al   ; insert it in the interrupt

            ; change all the registers
            mov   si, [bp+4]      ; array address is DS:SI
            mov   ax, [si]
            mov   bx, [si+2]
            mov   cx, [si+4]
            mov   dx, [si+6]
            mov   bp, [si+8]
            mov   di, [si+12]
            mov   es, [si+16]

            ; special manipulation for DS and SI
            push  ds              ; save ds
            push  si              ; save si
            push  ax              ; temp save of ax from array
            mov   ax, [si+14]     ; ds from array to ax
            mov   si, [si+10]     ; si from array to si
            mov   ds, ax          ; now move ax to ds
            pop   ax              ; restore ax

            ; call the interrupt
    interrupt_spot:
            int   0               ; dummy number for the interrupt

            ; special needs for SI and DS
            ; our SI and DS are at the top of the stack
            ; save values of flags, si and ds from interrupt
            pushf                 ; value from interrupt
            push  si              ; value from interrupt
            push  ds              ; value from interrupt
            add   sp, 6           ; get to our si and ds
            pop   si              ; our si
            pop   ds              ; our ds
            sub   sp, 10          ; sp is where it was a moment ago.
            mov   [si], ax        ; DS:SI points to array
            mov   [si+2], bx
            mov   [si+4], cx
            mov   [si+6], dx
            mov   [si+8], bp
            mov   [si+12], di
            mov   [si+16], es
            pop   [si+14]         ; ds from the interrupt
            pop   [si+10]         ; si from the interrupt
            pop   [si+18]         ; flags from the interrupt

            add   sp, 4           ; skip our DS and SI (already in regs)
            pop   es
            pop   bp              ; this is OUR bp
            pop   di
            pop   si
            pop   dx
            pop   cx
            pop   bx
            pop   ax

            mov   sp, bp
            pop   bp
            ret   (4)             ; clear arguments off the stack

    intr    endp
    ; - - - - - - - - - - - - - - - - - - - -

This should test your insight into using code. DS and SI are needed for moving data, so we use some kludges to get it to work. There are two things here that you shouldn't normally do. First, we are inserting the interrupt number directly in the machine code. Secondly, we are playing around with the value of SP. These are rare exceptions and shouldn't occur in your own code unless absolutely necessary.

The first thing you are going to say is, "Gosh, that's a lot of code for one interrupt." True, especially when the interrupt is interrupt 12h. Here's int 12h inside of our template file:

    ; - - - - - START CODE HERE

            int   12h             ; machine memory (return in ax)
            call  print_unsigned

    ; - - - - - END CODE HERE

It finds out how much memory your computer has and returns the number of kbytes in AX. But how much extra time does using this Pascal interrupt routine take? About 700 clocks or about .0002 seconds (that's right, 2 ten thousandths) on the slowest machine. How many times will you call it during a program? Only one time. There is no point in going down to the assembler level to write a program that saves you .0002 seconds. In Pascal, you would write:

    INTR ( $12 , the_regs) ;

and be done with it. No big loss of time and no trouble at all. In fact, as far as I can see, there is no reason for doing any single interrupt from the assembler level.
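
For readers who want to see the shape of the mechanism without a DOS machine at hand, here is a portable simulation in C. It is only a model and it calls no real interrupts: vector_table stands in for the 256 four-byte vectors that fill the first 1k of memory, sim_intr plays the part of INTR, and fake_memory_size is a made-up handler that puts an arbitrary 640 in AX. None of these names come from a real compiler library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register block in the Turbo Pascal order used in this chapter. */
struct regs {
    uint16_t ax, bx, cx, dx, bp, si, di, ds, es, flags;
};

/* In this model a "handler" is just a C function that works on the
   register block. A real handler works on the CPU registers, which
   is why the real routine copies the block in and out around it. */
typedef void (*handler_fn)(struct regs *r);

#define VECTOR_COUNT 256        /* 1k of low memory / 4 bytes each */

/* Stand-in for the interrupt vector table; empty slots are NULL. */
static handler_fn vector_table[VECTOR_COUNT];

/* Made-up handler standing in for int 12h. The value 640 is an
   arbitrary example, not a reading from any machine. */
void fake_memory_size(struct regs *r)
{
    r->ax = 640u;
}

/* Put a handler's address in the table slot for int_no. */
void install_handler(uint8_t int_no, handler_fn fn)
{
    vector_table[int_no] = fn;
}

/* Model of INTR: look up the handler by interrupt number, call it,
   and leave the results in the caller's register block.
   Returns 0 on success, -1 if no handler is installed. */
int sim_intr(uint8_t int_no, struct regs *r)
{
    handler_fn fn;

    assert(r != NULL);
    fn = vector_table[int_no];
    if (fn == NULL)
        return -1;
    fn(r);
    return 0;
}
```

After install_handler ( 0x12, fake_memory_size ), a call to sim_intr ( 0x12, &r ) leaves 640 in r.ax, which is the same pattern as the INTR call above: fill in the block, name the interrupt, read the answer out of the block.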

You may want to do a whole subprogram that contains interrupts, but if you just need one or two interrupts, it is easier to work from inside the high-level language. This includes the i/o we were talking about a minute ago. You can write a screen program inside a high level language using arrays. Just think of a screen as an 80X25 array. If a two dimensional array is too slow you need to go to a one dimensional array. All interrupts that tell what kind of video card is in the computer, what mode the screen is in, etc. can be done from the high-level language. The most you need assembler for (depending on the language) is moving the text array into video memory.

You want a bunch of help screens? Put all the help screens in a single file and use the interrupt for random access file read to read a screen when you need it.{2}

____________________

2. What you actually want to do is have the first block of data in the file tell you where each screen is and how long its data is. Then the first 2 bytes or words of the screen data should say the dimensions of the screen data ( 12 X 25, 17 X 3, etc.). This will allow you to store and use screens of any size.

Anything else? Yes, we still have the need for speed. There are certain types of operations like block moves of data, word searches and sorting of arrays that are characterized by large amounts of data and/or large amounts of computation. If you think you see a way to use registers effectively for one of these things, you probably can beat a compiled version of the subprogram. Then the only question is whether or not it is worth the trouble.

We have used the words "fast" and "slow" ambiguously so far, but now it is time to quantify them. Before you get the numbers, you need to know one thing about memory. People always talk about the "data bus". What is it? It is a group of wires connecting the 80x86 chip to memory.
The 8088 has 8 wires, the 8086, 80286 and 80386/SX have 16 wires, and the 80386 has 32 wires. That means that the 8088 can transfer 8 bits of information at one time, the 8086 et al. can transfer 16 bits at a time and the 80386 can transfer 32 bits at a time. This means one byte, two byte and four byte transfers respectively.

This also means that the memory bytes are ordered a little differently. You will never notice it externally, but here is the different internal ordering. The 8088 has all bytes one after the other. All memory read/writes are done with the same 8 wires:

    8088 MEMORY ADDRESSES

                 00005
                 00004
                 00003
                 00002
                 00001
                 00000

    data lines  ||||||||
                (8 bits)

(All our examples will use absolute memory locations starting at 00000.) The chips with a 16 bit data bus have all the even locations on the first 8 wires and the odd locations on the other 8 wires. They come in pairs - first even then odd:

    8086 MEMORY ADDRESSES

                 00006      00007
                 00004      00005
                 00002      00003
                 00000      00001

    data lines  ||||||||   ||||||||
                     (16 bits)

When one of these chips reads or writes, it can read/write either the left or the right byte or the whole word. What it cannot do is read the right byte from one pair along with the left byte from another pair. If you want to read the word at 00005:00006, the 8086 must:

    1) read the 00005 byte.
    2) read the 00006 byte.
    3) join them together.

This takes longer than just a single word read.

The true 80386 has a 32 bit data bus. This allows it to read 4 bytes at a time, and its physical memory structure looks like this:

    80386 MEMORY ADDRESSES

                 00010      00011      00012      00013
                 0000C      0000D      0000E      0000F
                 00008      00009      0000A      0000B
                 00004      00005      00006      00007
                 00000      00001      00002      00003

    data lines  ||||||||   ||||||||   ||||||||   ||||||||
                               (32 bits)

Instead of memory pairs, we now have memory quadruplets. As long as a word is totally inside of one quadruplet, the read/write time will be unaffected.
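
The pairs and quadruplets above boil down to one division: a piece of data needs one transfer for every aligned group of bytes it touches. Here is a minimal sketch in C of that bookkeeping. The function name bus_transfers is made up for this example, and the model is deliberately crude: it counts transfers only, not clocks, and says nothing about the other things that affect real instruction timing.

```c
#include <assert.h>

/* Count how many bus transfers are needed to read or write `size`
   bytes starting at address `addr`, when the data bus moves
   `bus_bytes` bytes at a time and each transfer can only touch one
   aligned group (1 byte on the 8088, a pair on the 8086, a
   quadruplet on the 80386). */
unsigned bus_transfers(unsigned long addr, unsigned size, unsigned bus_bytes)
{
    unsigned long first_group;
    unsigned long last_group;

    assert(size > 0 && bus_bytes > 0);

    first_group = addr / bus_bytes;               /* group of first byte */
    last_group  = (addr + size - 1) / bus_bytes;  /* group of last byte  */

    return (unsigned)(last_group - first_group + 1);
}
```

For the word at 00005:00006, bus_transfers ( 5, 2, 2 ) gives 2 on the 16 bit bus, while the word at 00004:00005 needs only 1. On the 32 bit bus the word at 00005:00006 sits inside one quadruplet and needs 1 transfer, but the word at 00007:00008 straddles two quadruplets and needs 2.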

If the read/write crosses a boundary (as the word at 00005:00006 did on the 8086), the read/write time will be affected in the same way. The 80386 can also read 4 byte data quickly as long as the total data is inside of one memory quadruplet.

In the 8086 family, data can always be read across these boundaries but it takes more time. (On the IBM 370, on the other hand, there are instructions that REQUIRE that data be aligned along 32 bit boundaries.) This means you should order your data in the following way in the data segment:

    QWORD DATA
    DWORD DATA
    TBYTE DATA      ; this is for the 8087
    WORD DATA
    BYTE DATA       ; all strings, etc.

This ensures that any read/write for that type of data will always be as fast as possible. If the segment definition has no alignment type, it will start on a paragraph boundary - i.e. every 16 bytes, and will work with anything.{3}

____________________

3. The alignment type is a word after the word SEGMENT which says how the segment should be aligned. The following:

    DATASTUFF SEGMENT BYTE PUBLIC 'DATA'

says the segment can be aligned at any byte. The allowable forms are BYTE, WORD, DWORD, PARA, PAGE (256 bytes). If there is no explicit type, the default is PARA.

In addition, if you ever subtract a number from SP to provide for a temporary data area, it should always be an even number. If SP is at an odd address instead of an even address, it takes longer for PUSHes and POPs. Also, when you define the size of the stack segment, it should be an even number of bytes.

Having said that, it is now time for you to see the speeds of instructions. Read the introduction to APPENDIX III, then glance at the times to get the general idea of how fast times are. Come back to this chapter when you are comfortable with what the times look like.

Have you read APPENDIX III? If not, do it before going on.

The compiled languages all have one thing in common.
They tell you that if you are writing a subroutine, you need to return from the subroutine with DS, BP, SS, and SP unchanged. They don't say a thing about any of the other registers. One thing this tells us is that they are doing everything from memory locations, not register locations.

If you have taken a good look at the execution times, you will have noticed the phenomenal difference in time between a "memory, register" addition and a "register, register" addition. Now, if all you are going to do is move a number to a register, add it, and move it out again, a compiler can do it as fast as you can. But, if you run into a situation where you can use three or four registers at the same time, you can cut the execution time drastically. Compilers really can't use registers as efficiently as we can (yet). This is an ideal spot for using assembly language. The old adage that 10% of the code uses 90% of the computer time is appropriate here.

You now know about assembler language, and you know what you want to do with it, so go out and enjoy. But before you do, try to slog your way through the next chapter on "simplified" segment definitions and linking to high level languages.